Scalable discovery of hybrid process models in a cloud computing environment
Process descriptions are used to create products and deliver services. To improve processes and services, the first step
is to learn a process model. Process discovery is a technique that can automatically extract process models from event logs.
Although various discovery techniques have been proposed, they focus either on constructing formal models, which are powerful
but complex, or on creating informal models, which are intuitive but lack semantics. In this work, we introduce a novel method that returns
hybrid process models to bridge this gap. Moreover, to cope with today's big event logs, we propose an efficient method, called f-HMD,
which aims at scalable hybrid model discovery in a cloud computing environment. We present a detailed implementation of our approach
over the Spark framework, and our experimental results demonstrate that the proposed method is efficient and scalable.
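The abstract does not spell out f-HMD's internals, but the core primitive behind scalable discovery, counting the directly-follows relation per data partition and then merging the partial counts, can be sketched in plain Python as a map/reduce-style stand-in for the actual Spark job (all names here are illustrative, not from the paper):

```python
from collections import Counter

def df_counts(traces):
    """Map step: count directly-follows pairs (a, b) in one partition of traces."""
    c = Counter()
    for trace in traces:
        c.update(zip(trace, trace[1:]))  # each adjacent pair is one observation
    return c

def merge(counters):
    """Reduce step: merge per-partition counts into the global relation."""
    total = Counter()
    for c in counters:
        total.update(c)
    return total

# Two "partitions" of a toy event log (each trace is a case's activity sequence)
p1 = [["a", "b", "c"], ["a", "b", "d"]]
p2 = [["a", "c", "d"]]
global_df = merge([df_counts(p1), df_counts(p2)])
```

Because `Counter` merging is associative, the reduce step can be applied in any order across partitions, which is what makes this counting pattern parallelize cleanly on a cluster.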
Alignment-based trace clustering
A novel method to cluster event log traces is presented in this paper. In contrast to approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the detected clusters, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force behind the clusters, instead of the model-agnostic criteria of current approaches, which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results that are more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs, or concept drift, it can be more robust in dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations for the computed clusters. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider, which uses an open-source solver.
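As an illustration of the centroid idea only (not the paper's pseudo-Boolean encoding), the following sketch assigns each log trace to the model trace it aligns with most cheaply, using plain edit distance as a stand-in for alignment cost; all names are illustrative:

```python
def edit_distance(s, t):
    """Levenshtein distance between two traces, a simple stand-in for alignment cost."""
    d = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        prev, d[0] = d[0], i
        for j, b in enumerate(t, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (a != b))
    return d[-1]

def cluster_by_centroid(log_traces, model_traces):
    """Assign each log trace to the model trace (centroid) it aligns to most cheaply."""
    clusters = {tuple(m): [] for m in model_traces}
    for trace in log_traces:
        best = min(model_traces, key=lambda m: edit_distance(trace, m))
        clusters[tuple(best)].append(trace)
    return clusters

model = [["a", "b", "c"], ["a", "d", "c"]]
log = [["a", "b", "c"], ["a", "b", "x", "c"], ["a", "d", "c"]]
clusters = cluster_by_centroid(log, model)
```

A deviating trace like `["a", "b", "x", "c"]` lands in the cluster of the model trace that explains it best, which is exactly the kind of model explanation the paper argues for.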
Prefix Imputation of Orphan Events in Event Stream Processing
In the context of process mining, event logs consist of process instances called cases. Conformance checking is a process mining task that inspects whether a log file conforms to an existing process model, quantifying the conformance in an explainable manner. Online conformance checking processes streaming event logs by maintaining precise insights into the running cases and mitigating non-conformance, if any, in a timely fashion. State-of-the-art online conformance checking approaches bound memory by either delimiting the storage of events per case or limiting the number of cases to a specific window width. The former technique still requires unbounded memory, as the number of cases to store is unlimited, while the latter technique forgets running, not yet concluded, cases in order to conform to the limited window width. Consequently, the processing system may later encounter events that represent some intermediate activity as per the process model and for which the relevant case has been forgotten; we refer to these as orphan events. The naïve approach to cope with an orphan event is either to neglect its relevant case for conformance checking or to treat it as an altogether new case. However, this might result in misleading process insights, for instance, overestimated non-conformance. In order to bound memory yet effectively incorporate orphan events into processing, we propose an approach that imputes the missing prefix of such orphan events. Our approach utilizes the existing process model for imputing the missing prefix. Furthermore, we leverage case storage management to increase the accuracy of the prefix prediction. We propose a systematic forgetting mechanism that distinguishes and forgets the cases that can be reliably regenerated as a prefix upon receipt of a future orphan event.
We evaluate the efficacy of our proposed approach through multiple experiments with synthetic and three real event logs while simulating a streaming setting. Our approach achieves considerably more realistic conformance statistics than the state of the art while requiring the same storage.
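A minimal sketch of the prefix-imputation idea, under the simplifying assumption that the process model is given as a set of allowed traces (the paper works on a process model directly; the function name and data layout are illustrative):

```python
def impute_prefix(model_traces, orphan_activity):
    """Regenerate a plausible missing prefix for an orphan event: the shortest
    model prefix that is immediately followed by the orphan's activity."""
    candidates = [
        trace[:i]                      # everything before the matching activity
        for trace in model_traces
        for i, act in enumerate(trace)
        if act == orphan_activity
    ]
    # Prefer the shortest prefix; None signals the activity never occurs in the model.
    return min(candidates, key=len) if candidates else None

model_traces = [["a", "b", "c", "d"], ["a", "c", "d"]]
```

With this toy model, an orphan event for activity `"d"` would be imputed the prefix `["a", "c"]`, after which conformance checking can continue as if the case had never been forgotten.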
Exact and Approximated Log Alignments for Processes with Inter-case Dependencies
The execution of different cases of a process is often restricted by
inter-case dependencies arising through, e.g., queueing or shared resources. Various
high-level Petri net formalisms have been proposed that are able to model and
analyze such coevolving cases. In this paper, we focus on a formalism tailored to
conformance checking through alignments, which introduces challenges related to
the constraints the model should put on interacting process instances and on
resource instances and their roles. We formulate requirements for modeling and
analyzing resource-constrained processes, and compare several Petri net extensions
that allow for incorporating inter-case constraints. We argue that the Resource-Constrained
ν-net is an appropriate formalism to be used in the context of
conformance checking, which traditionally aligns cases individually and thereby fails to
expose deviations on inter-case dependencies. We provide formal mathematical
foundations of the globally aligned event log based on the theory of partially
ordered sets, and propose an approximation technique, based on the composition of
individually aligned cases, that resolves inter-case violations locally.
Native Directly Follows Operator
Typical legacy information systems store data in relational databases.
Process mining is a research discipline that analyzes this data to obtain
insights into processes. Many different process mining techniques can be
applied to the data. In current techniques, an XES event log serves as the basis for
analysis. However, because of the static nature of an XES event log, one
XES file must be created for each process mining question, which leads to
overhead and inflexibility. As an alternative, attempts have been made to perform
process mining directly on the data source using so-called intermediate
structures. In previous work, we investigated methods to build intermediate
structures on source data by executing a basic SQL query on the database.
However, the nested form of the SQL query can cause performance issues on the
database side. Therefore, in this paper, we propose a native SQL operator for
direct process discovery on relational databases. We define a native operator
for the simplest form of the intermediate structure, called the "directly
follows relation". This approach has been evaluated with big event data, and the
experimental results show that it performs faster than the state of the art of
database approaches.
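The native operator itself is not shown in the abstract; for orientation, the directly-follows relation it computes can be expressed in conventional SQL over an assumed `(case_id, activity, ts)` schema, e.g. with a window function. This is the kind of query the native operator is meant to outperform (requires SQLite 3.25+ for `LEAD`; the schema is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_log (case_id TEXT, activity TEXT, ts INTEGER);
INSERT INTO event_log VALUES
  ('c1', 'a', 1), ('c1', 'b', 2), ('c1', 'c', 3),
  ('c2', 'a', 1), ('c2', 'c', 2);
""")

# Directly-follows relation: pair each event with its successor in the same
# case (ordered by timestamp), then count the (activity, next_activity) pairs.
rows = conn.execute("""
SELECT activity, next_activity, COUNT(*) AS freq
FROM (
  SELECT case_id, activity,
         LEAD(activity) OVER (PARTITION BY case_id ORDER BY ts) AS next_activity
  FROM event_log
)
WHERE next_activity IS NOT NULL
GROUP BY activity, next_activity
""").fetchall()
```

Note that the subquery (or an equivalent self-join) is exactly the nested form the paper identifies as a performance bottleneck, which motivates pushing the operation into the database engine as a native operator.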
Conformance Checking of Mixed-paradigm Process Models
Mixed-paradigm process models integrate strengths of procedural and
declarative representations like Petri nets and Declare. They are specifically
interesting for process mining because they allow capturing complex behaviour
in a compact way. A key research challenge for the proliferation of
mixed-paradigm models for process mining is the lack of corresponding
conformance checking techniques. In this paper, we address this problem by
devising the first approach that works with intertwined state spaces of
mixed-paradigm models. More specifically, our approach uses an alignment-based
replay to explore the state space and compute trace fitness in a procedural
way. In every state, the declarative constraints are separately updated, such
that violations disable the corresponding activities. Our technique provides
for an efficient replay towards an optimal alignment by respecting all
orthogonal Declare constraints. We have implemented our technique in ProM and
demonstrate its performance in an evaluation with real-world event logs. (Accepted for publication in Information Systems.)
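A toy sketch of the replay idea described above, in which declarative constraints are re-evaluated at each step and violations disable activities; the replay loop and the sample constraint are hypothetical illustrations, not the paper's algorithm:

```python
def replay(trace, activities, constraints):
    """Replay a trace step by step; at each state, Declare-style constraints
    may disable activities based on the history observed so far."""
    history = []
    for act in trace:
        # An activity is enabled only if every constraint permits it here.
        enabled = {a for a in activities
                   if all(c(history, a) for c in constraints)}
        if act not in enabled:
            return False, history  # deviation: the event fires a disabled activity
        history.append(act)
    return True, history

# Hypothetical "not co-existence" constraint: once 'b' has occurred, 'd' is disabled.
not_coexist_b_d = lambda history, a: not (a == "d" and "b" in history)
```

Replaying `["a", "b", "d"]` against this constraint stops at `"d"`, illustrating how a declarative violation surfaces during an otherwise procedural replay.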
Alignment-based Quality Metrics in Conformance Checking
The holy grail in process mining is a process discovery algorithm that, given an event log, produces fitting, precise, properly generalizing, and simple process models. Within the field of process mining, conformance checking is considered to be anything where observed behaviour, e.g., in the form of event logs or event streams, needs to be related to already modelled behaviour. In the conformance checking domain, the relation between an event log and a model is typically quantified using fitness, precision, and generalization. In this paper, we present metrics for fitness, precision, and generalization based on alignments and the newer concept of anti-alignments.
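For orientation, a common alignment-based fitness definition normalizes the alignment cost of a trace by its worst-case cost (deleting every trace event and inserting the shortest model trace); a minimal sketch, assuming unit move costs:

```python
def trace_fitness(alignment_cost, trace, shortest_model_trace_len):
    """Alignment-based fitness: 1 minus the alignment cost normalized by the
    worst case, i.e. removing the whole trace and inserting the shortest
    model trace (assumes every move costs 1)."""
    worst = len(trace) + shortest_model_trace_len
    return 1.0 - alignment_cost / worst
```

A perfectly fitting trace has cost 0 and fitness 1.0; a trace needing two model/log moves against a three-activity model scores 1 - 2/6 ≈ 0.67.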
Aligning Modeled and Observed Behavior: A Compromise Between Complexity and Quality
Certifying that a process model is aligned with the real process executions is perhaps the most desired feature a process model may have: aligned process models are crucial for organizations, since strategic decisions are easier to make on models than on plain data. In spite of its importance, current algorithmic support for computing alignments is limited: the only alternatives are techniques that explicitly explore the model behavior (which may be worst-case exponential with respect to the model size) or heuristic approaches that cannot guarantee a solution. In this paper we propose a solution that sits right in the middle of the complexity spectrum of alignment techniques; it can always guarantee a solution, whose quality depends on the exploration depth used and the local decisions taken at each step. We use linear algebraic techniques in combination with an iterative search that focuses on progressing towards a solution. The experiments show a clear reduction in the time required to reach a solution, without significantly sacrificing the quality of the alignment obtained.
Performance Analysis of Business Process Models with Advanced Constructs
The importance of actively managing and analysing business processes is acknowledged more than ever in organisations nowadays. Business processes form an essential part of an organisation, and their application areas are manifold. Most organisations keep records of various activities that have been carried out for auditing purposes, but these records are rarely used for analysis. This paper describes the design and implementation of a process analysis tool that replays, analyses, and visualises a variety of performance metrics using a process definition and its corresponding execution logs. The replayer uses a YAWL process model example to demonstrate its capacity to support advanced language constructs.